Probably Approximately Metric-Fair Learning
Authors
Abstract
We study fairness in machine learning. A learning algorithm, given a training set drawn from an underlying population, learns a classifier that will be used to make decisions about individuals. The concern is that this classifier’s decisions might be discriminatory, favoring certain subpopulations over others. The seminal work of Dwork et al. [ITCS 2012] introduced fairness through awareness, positing that a fair classifier should treat similar individuals similarly. Similarity between individuals is measured by a task-specific similarity metric. In the context of machine learning, however, this fairness notion faces serious difficulties, as it does not generalize and can be computationally intractable. We introduce a relaxed notion of approximate metric-fairness, which allows a small fairness error: for a random pair of individuals sampled from the population, with all but a small probability of error, if they are similar then they are treated similarly. In particular, this provides discrimination protections to every subpopulation that is not too small. We show that approximate metric-fairness does generalize from a training set to the underlying population, and we leverage these generalization guarantees to construct polynomial-time learning algorithms that achieve competitive accuracy subject to fairness constraints.
Research supported by the ISRAEL SCIENCE FOUNDATION (grant No. 5219/17).
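One way to make the relaxation precise (our notation; the paper's exact quantifiers may differ): a predictor $h$ is $(\alpha, \gamma)$-approximately metric-fair with respect to a similarity metric $d$ and population distribution $\mathcal{D}$ if

\[ \Pr_{(x, x') \sim \mathcal{D} \times \mathcal{D}} \big[\, |h(x) - h(x')| > d(x, x') + \gamma \,\big] \le \alpha . \]

That is, for all but an $\alpha$-fraction of random pairs, the gap in treatment exceeds the prescribed similarity by at most the additive slack $\gamma$; any subpopulation of mass well above $\alpha$ therefore retains a meaningful protection.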
Similar papers
A Metric Entropy Bound is Not Sufficient for Learnability
We prove by means of a counterexample that it is not sufficient, for PAC learning under a class of distributions, to have a uniform bound on the metric entropy of the class of concepts to be learned. This settles a conjecture of Benedek and Itai.
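For context, a brief recap of the quantity involved (standard definitions, not taken from the paper): the metric entropy of a concept class $C$ under a (pseudo)metric $d$ is the logarithm of its covering numbers,

\[ H(\varepsilon) = \log N(\varepsilon, C, d), \]

where $N(\varepsilon, C, d)$ is the smallest number of $\varepsilon$-balls needed to cover $C$. The counterexample shows that a uniform bound on $H(\varepsilon)$ across the allowed distributions does not, on its own, imply PAC learnability.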
PAC-Learning for Energy-based Models
In this thesis we prove that probably approximately correct (PAC) learning is guaranteed for the framework of energy-based models. Starting from basic inequalities, we establish our theory on the existence of a metric between hypotheses with respect to which the energy function is Lipschitz continuous. The theory yields a new scheme of regularization called central regularizati...
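Loosely formalized (our notation; the thesis's exact statement may differ), the assumption is that the energy function $E$ is Lipschitz in its hypothesis argument with respect to a metric $d$ on the hypothesis space:

\[ |E(h, x) - E(h', x)| \le L \cdot d(h, h') \quad \text{for all inputs } x, \]

so that hypotheses close under $d$ assign nearly equal energies.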
Online Learning with an Unknown Fairness Metric
We consider the problem of online learning in the linear contextual bandits setting, where there are also strong individual fairness constraints governed by an unknown similarity metric. These constraints demand that we select similar actions or individuals with approximately equal probability [Dwork et al., 2012], which may be at odds with optimizing reward, thus modeling settings where...
Learnability and the doubling dimension
Given a set of classifiers and a probability distribution over their domain, one can define a metric by taking the distance between a pair of classifiers to be the probability that they classify a random item differently. We prove bounds on the sample complexity of PAC learning in terms of the doubling dimension of this metric. These bounds imply known bounds on the sample complexity of learnin...
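The distance in question is easy to estimate by sampling. A minimal sketch in plain NumPy (function and variable names are ours, purely illustrative):

import numpy as np

def disagreement_distance(f, g, sample):
    # Monte-Carlo estimate of d(f, g) = Pr_x[f(x) != g(x)],
    # where x is drawn from the domain distribution.
    preds_f = np.array([f(x) for x in sample])
    preds_g = np.array([g(x) for x in sample])
    return float(np.mean(preds_f != preds_g))

# Example: two threshold classifiers under the uniform distribution on [0, 1].
rng = np.random.default_rng(0)
sample = rng.uniform(0.0, 1.0, size=10_000)
f = lambda x: x > 0.4
g = lambda x: x > 0.6
print(disagreement_distance(f, g, sample))  # roughly 0.2: they disagree on (0.4, 0.6]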
Composite Kernel Optimization in Semi-Supervised Metric Learning
Machine-learning solutions to classification, clustering and matching problems critically depend on the adopted metric, which in the past was selected heuristically. In the last decade, it has been demonstrated that an appropriate metric can be learnt from data, resulting in superior performance as compared with traditional metrics. This has recently stimulated a considerable interest in the to...